Unlabeled Compression Schemes for Maximum Classes
Authors
Abstract
Maximum classes of domain size n and VC dimension d have $\binom{n}{\le d} = \sum_{i=0}^{d}\binom{n}{i}$ concepts, and this is an upper bound on the size of any such class. We give a compression scheme for any maximum class that represents each concept by a subset of at most d unlabeled domain points and has the property that, for any sample of a concept in the class, the representative of exactly one of the concepts consistent with the sample is a subset of the domain of the sample. This allows us to compress any sample of a concept in the class to a subset of at most d unlabeled sample points such that this subset represents a concept consistent with the entire original sample. Unlike the previously known compression scheme for maximum classes (Floyd and Warmuth, 1995), which compresses to labeled subsets of the sample of size exactly d, our new scheme is tight in the sense that the number of possible unlabeled compression sets of size at most d equals the number of concepts in the class.
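To make the counting claim concrete, here is a minimal Python sketch (the helper names `sauer_bound` and `unlabeled_reps` are hypothetical, not from the paper) that checks that the number of unlabeled subsets of size at most d equals the Sauer-Shelah bound $\binom{n}{\le d}$, i.e. the size of a maximum class. It illustrates only why the scheme is tight as a count; it does not construct the representation mapping itself.

```python
from math import comb
from itertools import combinations

def sauer_bound(n: int, d: int) -> int:
    """Size of a maximum class of VC dimension d on n domain points:
    sum_{i=0}^{d} C(n, i)."""
    return sum(comb(n, i) for i in range(d + 1))

def unlabeled_reps(domain, d):
    """All subsets of the domain with at most d points: the candidate
    unlabeled compression sets described in the abstract."""
    return [frozenset(s) for i in range(d + 1) for s in combinations(domain, i)]

n, d = 6, 2
reps = unlabeled_reps(range(n), d)
# Tightness: the number of unlabeled sets of size <= d equals the number of
# concepts in a maximum class of VC dimension d on n points.
assert len(reps) == sauer_bound(n, d) == 22  # 1 + 6 + 15
```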
Similar papers
Labeled Compression Schemes for Extremal Classes
It is a long-standing open problem whether there exists a compression scheme whose size is of the order of the Vapnik-Chervonenkis (VC) dimension d. Recently, compression schemes of size exponential in d have been found for any concept class of VC dimension d. Previously, compression schemes of size d have been given for maximum classes, which are special concept classes whose size equals an upp...
Recursive Teaching Dimension, Learning Complexity, and Maximum Classes
This paper is concerned with the combinatorial structure of concept classes that can be learned from a small number of examples. We show that the recently introduced notion of recursive teaching dimension (RTD, reflecting the complexity of teaching a concept class) is a relevant parameter in this context. Comparing the RTD to self-directed learning, we establish new lower bounds on the query co...
Recursive teaching dimension, VC-dimension and sample compression
This paper is concerned with various combinatorial parameters of classes that can be learned from a small set of examples. We show that the recursive teaching dimension, recently introduced by Zilles et al. (2008), is strongly connected to known complexity notions in machine learning, e.g., the self-directed learning complexity and the VC-dimension. To the best of our knowledge these are the fi...
Geometric & Topological Representations of Maximum Classes with Applications to Sample Compression
We systematically investigate finite maximum classes, which play an important role in machine learning as concept classes meeting Sauer's Lemma with equality. Simple arrangements of hyperplanes in hyperbolic space are shown to represent maximum classes, generalizing the corresponding Euclidean result. We show that sweeping a generic hyperplane across such arrangements forms an unlabeled compres...
A Geometric Approach to Sample Compression
The Sample Compression Conjecture of Littlestone & Warmuth has remained unsolved for over two decades. While maximum classes (concept classes meeting Sauer's Lemma with equality) can be compressed, the compression of general concept classes reduces to compressing maximal classes (classes that cannot be expanded without increasing VC dimension). Two promising ways forward are: embedding maximal c...